多项式方程系统经常在计算机视觉中产生,特别是在多视图几何问题中。用于解决这些系统的传统方法通常旨在消除变量达到单变量多项式,例如5点姿势估计的第十阶多项式,使用巧妙的操纵,或者更普遍使用Grobner基础,结果和消除模板,导致多视图几何和其他问题的成功算法。然而,当问题复杂时,这些方法不起作用,当他们这样做时,它们面临效率和稳定性问题。同型延续(HC)可以解决更复杂的问题而没有稳定性问题,并且保证全球解决方案,但已知它们是缓慢的。在本文中,我们表明HC可以在GPU上并行化,在多项式基准测试中显示出高达26倍的显着加速。我们还表明,GPU-HC可以在一系列计算机视觉问题上应用于一系列计算机视觉问题,包括具有未知焦距的4视图三角测量和三焦点姿态估计,其无法用消除模板解决,但它们可以用HC有效地解决它们。 GPU-HC打开门,以轻松配方和解决一系列计算机视觉问题。
translated by 谷歌翻译
We present a method for solving two minimal problems for relative camera pose estimation from three views, which are based on three view correspondences of i) three points and one line and the novel case of ii) three points and two lines through two of the points. These problems are too difficult to be efficiently solved by the state of the art Groebner basis methods. Our method is based on a new efficient homotopy continuation (HC) solver framework MINUS, which dramatically speeds up previous HC solving by specializing HC methods to generic cases of our problems. We characterize their number of solutions and show with simulated experiments that our solvers are numerically robust and stable under image noise, a key contribution given the borderline intractable degree of nonlinearity of trinocular constraints. We show in real experiments that i) SIFT feature location and orientation provide good enough point-and-line correspondences for three-view reconstruction and ii) that we can solve difficult cases with too few or too noisy tentative matches, where the state of the art structure from motion initialization fails.
translated by 谷歌翻译
We show for the first time that large-scale generative pretrained transformer (GPT) family models can be pruned to at least 50% sparsity in one-shot, without any retraining, at minimal loss of accuracy. This is achieved via a new pruning method called SparseGPT, specifically designed to work efficiently and accurately on massive GPT-family models. When executing SparseGPT on the largest available open-source models, OPT-175B and BLOOM-176B, we can reach 60% sparsity with negligible increase in perplexity: remarkably, more than 100 billion weights from these models can be ignored at inference time. SparseGPT generalizes to semi-structured (2:4 and 4:8) patterns, and is compatible with weight quantization approaches.
translated by 谷歌翻译
Neuromorphic systems require user-friendly software to support the design and optimization of experiments. In this work, we address this need by presenting our development of a machine learning-based modeling framework for the BrainScaleS-2 neuromorphic system. This work represents an improvement over previous efforts, which either focused on the matrix-multiplication mode of BrainScaleS-2 or lacked full automation. Our framework, called hxtorch.snn, enables the hardware-in-the-loop training of spiking neural networks within PyTorch, including support for auto differentiation in a fully-automated hardware experiment workflow. In addition, hxtorch.snn facilitates seamless transitions between emulating on hardware and simulating in software. We demonstrate the capabilities of hxtorch.snn on a classification task using the Yin-Yang dataset employing a gradient-based approach with surrogate gradients and densely sampled membrane observations from the BrainScaleS-2 hardware system.
translated by 谷歌翻译
A digital twin is defined as a virtual representation of a physical asset enabled through data and simulators for real-time prediction, optimization, monitoring, controlling, and improved decision-making. Unfortunately, the term remains vague and says little about its capability. Recently, the concept of capability level has been introduced to address this issue. Based on its capability, the concept states that a digital twin can be categorized on a scale from zero to five, referred to as standalone, descriptive, diagnostic, predictive, prescriptive, and autonomous, respectively. The current work introduces the concept in the context of the built environment. It demonstrates the concept by using a modern house as a use case. The house is equipped with an array of sensors that collect timeseries data regarding the internal state of the house. Together with physics-based and data-driven models, these data are used to develop digital twins at different capability levels demonstrated in virtual reality. The work, in addition to presenting a blueprint for developing digital twins, also provided future research directions to enhance the technology.
translated by 谷歌翻译
The concept of walkable urban development has gained increased attention due to its public health, economic, and environmental sustainability benefits. Unfortunately, land zoning and historic under-investment have resulted in spatial inequality in walkability and social inequality among residents. We tackle the problem of Walkability Optimization through the lens of combinatorial optimization. The task is to select locations in which additional amenities (e.g., grocery stores, schools, restaurants) can be allocated to improve resident access via walking while taking into account existing amenities and providing multiple options (e.g., for restaurants). To this end, we derive Mixed-Integer Linear Programming (MILP) and Constraint Programming (CP) models. Moreover, we show that the problem's objective function is submodular in special cases, which motivates an efficient greedy heuristic. We conduct a case study on 31 underserved neighborhoods in the City of Toronto, Canada. MILP finds the best solutions in most scenarios but does not scale well with network size. The greedy algorithm scales well and finds near-optimal solutions. Our empirical evaluation shows that neighbourhoods with low walkability have a great potential for transformation into pedestrian-friendly neighbourhoods by strategically placing new amenities. Allocating 3 additional grocery stores, schools, and restaurants can improve the "WalkScore" by more than 50 points (on a scale of 100) for 4 neighbourhoods and reduce the walking distances to amenities for 75% of all residential locations to 10 minutes for all amenity types. Our code and paper appendix are available at https://github.com/khalil-research/walkability.
translated by 谷歌翻译
As machine learning (ML) systems get adopted in more critical areas, it has become increasingly crucial to address the bias that could occur in these systems. Several fairness pre-processing algorithms are available to alleviate implicit biases during model training. These algorithms employ different concepts of fairness, often leading to conflicting strategies with consequential trade-offs between fairness and accuracy. In this work, we evaluate three popular fairness pre-processing algorithms and investigate the potential for combining all algorithms into a more robust pre-processing ensemble. We report on lessons learned that can help practitioners better select fairness algorithms for their models.
translated by 谷歌翻译
Federated Deep Learning frameworks can be used strategically to monitor Land Use locally and infer environmental impacts globally. Distributed data from across the world would be needed to build a global model for Land Use classification. The need for a Federated approach in this application domain would be to avoid transfer of data from distributed locations and save network bandwidth to reduce communication cost. We use a Federated UNet model for Semantic Segmentation of satellite and street view images. The novelty of the proposed architecture is the integration of Knowledge Distillation to reduce communication cost and response time. The accuracy obtained was above 95% and we also brought in a significant model compression to over 17 times and 62 times for street View and satellite images respectively. Our proposed framework has the potential to be a game-changer in real-time tracking of climate change across the planet.
translated by 谷歌翻译
We study the algorithm configuration (AC) problem, in which one seeks to find an optimal parameter configuration of a given target algorithm in an automated way. Recently, there has been significant progress in designing AC approaches that satisfy strong theoretical guarantees. However, a significant gap still remains between the practical performance of these approaches and state-of-the-art heuristic methods. To this end, we introduce AC-Band, a general approach for the AC problem based on multi-armed bandits that provides theoretical guarantees while exhibiting strong practical performance. We show that AC-Band requires significantly less computation time than other AC approaches providing theoretical guarantees while still yielding high-quality configurations.
translated by 谷歌翻译
Visual Question Answering (VQA) models often perform poorly on out-of-distribution data and struggle on domain generalization. Due to the multi-modal nature of this task, multiple factors of variation are intertwined, making generalization difficult to analyze. This motivates us to introduce a virtual benchmark, Super-CLEVR, where different factors in VQA domain shifts can be isolated in order that their effects can be studied independently. Four factors are considered: visual complexity, question redundancy, concept distribution and concept compositionality. With controllably generated data, Super-CLEVR enables us to test VQA methods in situations where the test data differs from the training data along each of these axes. We study four existing methods, including two neural symbolic methods NSCL and NSVQA, and two non-symbolic methods FiLM and mDETR; and our proposed method, probabilistic NSVQA (P-NSVQA), which extends NSVQA with uncertainty reasoning. P-NSVQA outperforms other methods on three of the four domain shift factors. Our results suggest that disentangling reasoning and perception, combined with probabilistic uncertainty, form a strong VQA model that is more robust to domain shifts. The dataset and code are released at https://github.com/Lizw14/Super-CLEVR.
translated by 谷歌翻译